Upsortable: Programming TopK Queries Over Data Streams

Authors

  • Julien Subercaze
  • Christophe Gravier
  • Syed Gillani
  • Abderrahmen Kammoun
  • Frédérique Laforest

Abstract

Top-k queries over data streams are a well-studied problem. Numerous systems exist that process continuous queries over sliding windows. In contrast, non-append-only streams call for ad hoc solutions, e.g. tailor-made solutions implemented in a mainstream programming language. Meanwhile, the Stream API and lambda expressions were added in Java 8, which thereby gained powerful operations for data stream processing. However, the Java Collections Framework does not provide data structures that safely and conveniently support sorted collections of evolving data. In this paper, we demonstrate Upsortable, an annotation-based approach that allows existing sorted collections from the standard Java API to be used for dynamic data management. Our approach relies on a combination of pre-compilation abstract syntax tree modifications and runtime analysis of bytecode. Upsortable offers the developer a safe and time-efficient solution for developing top-k queries on data streams while remaining fully compatible with standard Java.
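
To illustrate the problem the abstract describes, the following is a minimal sketch of the manual pattern a developer would otherwise write by hand: a standard `TreeSet` keeps elements at the position computed when they were inserted, so a score that changes in place leaves the set in a stale order. The sketch removes the element, updates it, and reinserts it. This is not Upsortable's API or implementation; the names `EvolvingTopK`, `Item`, `updateScore`, and `topK` are hypothetical and introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class EvolvingTopK {
    // Hypothetical stream item whose score evolves over time (assumption for this sketch).
    static final class Item {
        final String id;
        int score;

        Item(String id, int score) {
            this.id = id;
            this.score = score;
        }
    }

    // Highest score first; ties are broken by id so distinct items never compare as equal.
    static final Comparator<Item> BY_SCORE_DESC =
            Comparator.comparingInt((Item i) -> i.score).reversed()
                      .thenComparing(i -> i.id);

    // Manual safe update: remove while the old score is still in place,
    // then mutate, then reinsert so the set re-sorts the element.
    static void updateScore(TreeSet<Item> set, Item item, int newScore) {
        set.remove(item);
        item.score = newScore;
        set.add(item);
    }

    // Returns the ids of the first k elements in the set's iteration order.
    static List<String> topK(TreeSet<Item> set, int k) {
        List<String> ids = new ArrayList<>();
        for (Item i : set) {
            if (ids.size() == k) {
                break;
            }
            ids.add(i.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        TreeSet<Item> set = new TreeSet<>(BY_SCORE_DESC);
        Item a = new Item("a", 10);
        set.add(a);
        set.add(new Item("b", 20));
        set.add(new Item("c", 30));

        updateScore(set, a, 40);
        System.out.println(topK(set, 2));
    }
}
```

Forgetting the remove-and-reinsert step anywhere a sort key is modified silently corrupts the ordering, which is the kind of error an annotation-based approach such as the one described in the abstract aims to rule out.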

Related resources

A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries

Data streams are infinite, fast sequences of time-stamped data elements that arrive in bursts. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries on them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and accuracy of answers. They will be more ...

Reporting l most influential objects in uncertain databases based on probabilistic reverse top-k queries

Reverse top-k queries are proposed from the perspective of a product manufacturer and are essential for manufacturers to assess the potential market. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact (or certain). Due to the intrinsic differences between uncertain and certain data, these methods cannot be applied to pr...

Top-k Dominating Queries: a Survey

Top-k dominating queries combine the advantages of top-k queries and skyline queries, and eliminate their disadvantages. They return k objects with the highest domination score, which is defined as the number of dominated objects. As a top-k query, the user can bound the number of returned results through the parameter k, and like a skyline query a user-selected scoring function is not required...

CrowdK: Answering top-k queries with crowdsourcing

In recent years, crowdsourcing has emerged as a new computing paradigm for bridging the gap between human- and machine-based computation. As one of the core operations in data retrieval, we study top-k queries with crowdsourcing, namely crowd-enabled top-k queries. This problem is formulated with three key factors: latency, monetary cost, and quality of answers. We first aim to design a novel fr...

Handling ER-topk Query on Uncertain Streams

Data uncertainty is widespread in many applications. In this paper, we aim at handling top-k queries on uncertain data streams. Since the volume of a data stream is unbounded whereas memory is limited, it is critical to devise one-pass solutions that are both time- and space-efficient. In this paper, we use two structures to handle this issue. The DomGraph stores all tuples that are p...

Journal:
  • PVLDB

Volume: 10

Publication date: 2017